40 research outputs found
The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning
The nascent field of fair machine learning aims to ensure that decisions
guided by algorithms are equitable. Over the last several years, three formal
definitions of fairness have gained prominence: (1) anti-classification,
meaning that protected attributes---like race, gender, and their proxies---are
not explicitly used to make decisions; (2) classification parity, meaning that
common measures of predictive performance (e.g., false positive and false
negative rates) are equal across groups defined by the protected attributes;
and (3) calibration, meaning that conditional on risk estimates, outcomes are
independent of protected attributes. Here we show that all three of these
fairness definitions suffer from significant statistical limitations. Requiring
anti-classification or classification parity can, perversely, harm the very
groups they were designed to protect; and calibration, though generally
desirable, provides little guarantee that decisions are equitable. In contrast
to these formal fairness criteria, we argue that it is often preferable to
treat similarly risky people similarly, based on the most statistically
accurate estimates of risk that one can produce. Such a strategy, while not
universally applicable, often aligns well with policy objectives; notably, this
strategy will typically violate both anti-classification and classification
parity. In practice, it requires significant effort to construct suitable risk
estimates. One must carefully define and measure the targets of prediction to
avoid retrenching biases in the data. But, importantly, one cannot generally
address these difficulties by requiring that algorithms satisfy popular
mathematical formalizations of fairness. By highlighting these challenges in
the foundation of fair machine learning, we hope to help researchers and
practitioners productively advance the area.
Matched Pair Calibration for Ranking Fairness
We propose a test of fairness in score-based ranking systems called matched
pair calibration. Our approach constructs a set of matched item pairs with
minimal confounding differences between subgroups before computing an
appropriate measure of ranking error over the set. The matching step ensures
that we compare subgroup outcomes between identically scored items so that
measured performance differences directly imply unfairness in subgroup-level
exposures. We show how our approach generalizes the fairness intuitions of
calibration from a binary classification setting to ranking and connect our
approach to other proposals for ranking fairness measures. Moreover, our
strategy shows how the logic of marginal outcome tests extends to cases where
the analyst has access to model scores. Lastly, we provide an example of
applying matched pair calibration to a real-world ranking data set to
demonstrate its efficacy in detecting ranking bias.
Comment: 19 pages, 8 figures
POTs: Protective Optimization Technologies
Algorithmic fairness aims to address the economic, moral, social, and
political impact that digital systems have on populations through solutions
that can be applied by service providers. Fairness frameworks do so, in part,
by mapping these problems to a narrow definition and assuming the service
providers can be trusted to deploy countermeasures. Not surprisingly, these
decisions limit fairness frameworks' ability to capture a variety of harms
caused by systems.
We characterize fairness limitations using concepts from requirements
engineering and from social sciences. We show that the focus on algorithms'
inputs and outputs misses harms that arise from systems interacting with the
world; that the focus on bias and discrimination omits broader harms on
populations and their environments; and that relying on service providers
excludes scenarios where they are not cooperative or intentionally adversarial.
We propose Protective Optimization Technologies (POTs). POTs provide means
for affected parties to address the negative impacts of systems in the
environment, expanding avenues for political contestation. POTs intervene from
outside the system, do not require service providers to cooperate, and can
serve to correct, shift, or expose harms that systems impose on populations and
their environments. We illustrate the potential and limitations of POTs in two
case studies: countering road congestion caused by traffic-beating
applications, and recalibrating credit scoring for loan applicants.
Comment: Appears in Conference on Fairness, Accountability, and Transparency
(FAT* 2020). Bogdan Kulynych and Rebekah Overdorf contributed equally to this
work. Version v1/v2 by Seda Gürses, Rebekah Overdorf, and Ero Balsa was
presented at HotPETS 2018 and at PiMLAI 201
The Changing Landscape for Stroke Prevention in AF: Findings From the GLORIA-AF Registry Phase 2
BACKGROUND GLORIA-AF (Global Registry on Long-Term Oral Antithrombotic Treatment in Patients with Atrial Fibrillation) is a prospective, global registry program describing antithrombotic treatment patterns in patients with newly diagnosed nonvalvular atrial fibrillation at risk of stroke. Phase 2 began when dabigatran, the first non–vitamin K antagonist oral anticoagulant (NOAC), became available.
OBJECTIVES This study sought to describe phase 2 baseline data and compare these with the pre-NOAC era collected during phase 1.
METHODS During phase 2, 15,641 consenting patients were enrolled (November 2011 to December 2014); 15,092 were eligible. This pre-specified cross-sectional analysis describes eligible patients' baseline characteristics. Atrial fibrillation disease characteristics, medical outcomes, and concomitant diseases and medications were collected. Data were analyzed using descriptive statistics.
RESULTS Of the total patients, 45.5% were female; median age was 71 (interquartile range: 64, 78) years. Patients were from Europe (47.1%), North America (22.5%), Asia (20.3%), Latin America (6.0%), and the Middle East/Africa (4.0%). Most had high stroke risk (CHA2DS2-VASc [Congestive heart failure, Hypertension, Age ≥75 years, Diabetes mellitus, previous Stroke, Vascular disease, Age 65 to 74 years, Sex category] score ≥2; 86.1%); 13.9% had moderate risk (CHA2DS2-VASc = 1). Overall, 79.9% received oral anticoagulants, of whom 47.6% received NOAC and 32.3% vitamin K antagonists (VKA); 12.1% received antiplatelet agents; 7.8% received no antithrombotic treatment. For comparison, the proportion of phase 1 patients (of N = 1,063 all eligible) prescribed VKA was 32.8%, acetylsalicylic acid 41.7%, and no therapy 20.2%. In Europe in phase 2, treatment with NOAC was more common than VKA (52.3% and 37.8%, respectively); 6.0% of patients received antiplatelet treatment; and 3.8% received no antithrombotic treatment. In North America, 52.1%, 26.2%, and 14.0% of patients received NOAC, VKA, and antiplatelet drugs, respectively; 7.5% received no antithrombotic treatment. NOAC use was less common in Asia (27.7%), where 27.5% of patients received VKA, 25.0% antiplatelet drugs, and 19.8% no antithrombotic treatment.
CONCLUSIONS The baseline data from GLORIA-AF phase 2 demonstrate that in newly diagnosed nonvalvular atrial fibrillation patients, NOAC have been highly adopted into practice, becoming more frequently prescribed than VKA in Europe and North America. Worldwide, however, a large proportion of patients remain undertreated, particularly in Asia and North America. (Global Registry on Long-Term Oral Antithrombotic Treatment in Patients With Atrial Fibrillation [GLORIA-AF]; NCT01468701)
Effect of angiotensin-converting enzyme inhibitor and angiotensin receptor blocker initiation on organ support-free days in patients hospitalized with COVID-19
IMPORTANCE Overactivation of the renin-angiotensin system (RAS) may contribute to poor clinical outcomes in patients with COVID-19.
OBJECTIVE To determine whether angiotensin-converting enzyme (ACE) inhibitor or angiotensin receptor blocker (ARB) initiation improves outcomes in patients hospitalized for COVID-19.
DESIGN, SETTING, AND PARTICIPANTS In an ongoing, adaptive platform randomized clinical trial, 721 critically ill and 58 non–critically ill hospitalized adults were randomized to receive an RAS inhibitor or control between March 16, 2021, and February 25, 2022, at 69 sites in 7 countries (final follow-up on June 1, 2022).
INTERVENTIONS Patients were randomized to receive open-label initiation of an ACE inhibitor (n = 257), ARB (n = 248), ARB in combination with DMX-200 (a chemokine receptor-2 inhibitor; n = 10), or no RAS inhibitor (control; n = 264) for up to 10 days.
MAIN OUTCOMES AND MEASURES The primary outcome was organ support–free days, a composite of hospital survival and days alive without cardiovascular or respiratory organ support through 21 days. The primary analysis was a bayesian cumulative logistic model. Odds ratios (ORs) greater than 1 represent improved outcomes.
RESULTS On February 25, 2022, enrollment was discontinued due to safety concerns. Among 679 critically ill patients with available primary outcome data, the median age was 56 years and 239 participants (35.2%) were women. Median (IQR) organ support–free days among critically ill patients was 10 (−1 to 16) in the ACE inhibitor group (n = 231), 8 (−1 to 17) in the ARB group (n = 217), and 12 (0 to 17) in the control group (n = 231) (median adjusted odds ratios of 0.77 [95% bayesian credible interval, 0.58-1.06] for improvement for ACE inhibitor and 0.76 [95% credible interval, 0.56-1.05] for ARB compared with control). The posterior probabilities that ACE inhibitors and ARBs worsened organ support–free days compared with control were 94.9% and 95.4%, respectively. Hospital survival occurred in 166 of 231 critically ill participants (71.9%) in the ACE inhibitor group, 152 of 217 (70.0%) in the ARB group, and 182 of 231 (78.8%) in the control group (posterior probabilities that ACE inhibitor and ARB worsened hospital survival compared with control were 95.3% and 98.1%, respectively).
CONCLUSIONS AND RELEVANCE In this trial, among critically ill adults with COVID-19, initiation of an ACE inhibitor or ARB did not improve, and likely worsened, clinical outcomes.
TRIAL REGISTRATION ClinicalTrials.gov Identifier: NCT0273570